\documentclass[a4paper,12pt,notitlepage]{article}
\usepackage{amsmath,amssymb,amsfonts,amsthm}
\usepackage[english]{babel}
\usepackage[left=3cm,right=2cm,top=2.5cm,bottom=2.5cm]{geometry}

\theoremstyle{plain}
\newtheorem{theorem}{Theorem}
\newtheorem*{observation}{Observation}

\title {On the Number of Distinct Subpalindromes in Words}

\author
{ 
	Mikhail Rubinchik and Arseny M. Shur\\
	Ural Federal University\\
	Ekaterinburg, Russia
}

\date{}

\begin{document}
\maketitle

Palindromes are among the most important and actively studied repetitions in words. Recall that a word $w=a_1\cdots a_n$ is a palindrome if $a_1\cdots a_n=a_n\cdots a_1$. In particular, all letters are palindromes; the empty word is also considered as a palindrome, but below we do not count it. A group of combinatorial problems concerns the possible number of distinct palindromic factors, or subpalindromes, in a word. We call this number \emph{palindromic richness}. 

Clearly, for the words containing $k$ different letters the lower bound for their palindromic richness is $k$. If $k>2$, then this bound is sharp, since the infinite word $(a_1\cdots a_k)^\omega$, where $a_1,\ldots, a_k$ are different letters, has no subpalindromes except letters. For $k=2$ the situation is less obvious: the minimum richness of an infinite word is 8, and the minimum richness of an \emph{aperiodic} infinite word is 10 \cite{FiZa13}. On the other hand, the maximum richness of an $n$-letter word over any alphabet is $n$, as was first observed in \cite{DJP01}. Such ``rich'' words are objects of intensive study (see, e.g., \cite{GJWZ09}). 

However, the extremums mentioned above give no clue about the generic case.

\begin{observation}
Any number between 8 and $N$ in the binary case, and between $k$ and $N$ in the $k$-ary case with $k>2$ is the palindromic richness of a word of length $N$.
\end{observation}

So, the following question is quite natural:
\begin{center}
\emph{what is the expected palindromic richness of a random word of length $N$?}
\end{center}
We studied this question using both theory and numerical experiments. Our main theoretic result is the following

\begin{theorem}
For any fixed alphabet $\Sigma$, the expected palindromic richness of a random word of length $N$ over $\Sigma$ is $\Theta(\sqrt{N})$.
\end{theorem}

Note that the expected total number of nontrivial subpalindromes in a random word is $\Theta(N)$, but the constant drops quickly as the alphabet grows. In a contrast with that, the constant in the $\Theta$-expression from Theorem~1 even grows with $k=|\Sigma|$. More precisely, we proved that this constant $C(k)$
\begin{itemize}
\item tends to an absolute constant as $k\to\infty$ if $N$ is close to an even power of $k$;
\item grows as $\sqrt{k}$ if $N$ is close to an odd power of $k$;
\item stays in between the above extremal values in the remaining cases.
\end{itemize}

We also performed some computational experiments based on the linear-time algorithm for counting distinct subpalindromes in a word \cite{KRS13}. By averaging the data obtained for groups of random words we derive the following estimations for $C(k)$:
$$
\begin{array}{|l|l|l|}
\hline
k&C(k) \text{ for even powers}&C(k) \text{ for odd powers}\\
\hline
2& 6.129 \text{ for } N=2^{16}&6.164 \text{ for } N=2^{17}\\
3& 4.393 \text{ for } N=3^{12}&4.408 \text{ for } N=3^{13}\\
10& 3.023 \text{ for } N=10^{6}&3.388 \text{ for } N=10^{7}\\
50& 2.702 \text{ for } N=50^{4}&5.038 \text{ for } N=50^{3}\\
\hline
\end{array}
$$
The figures in the central column decrease but seem to have a limit about $2.5$.  The figures in the rightmost column demonstrate an initial decrement but then grow back to follow the theoretical bound of $\Theta(\sqrt{k})$. All these figures are nicely predicted by a very simple probabilistic model in which all events of the form ``to contain a fixed subpalindrome $u$ of length $p$'' are assumed independent and equiprobable. This is worth noting because these events in fact are dependent and have different probabilities.


%\bibliographystyle{plain}          
%\bibliography{my_bib}            

\newcommand{\noopsort}[1]{} \newcommand{\singleletter}[1]{#1}
\begin{thebibliography}{1}

\bibitem{DJP01}
X.~Droubay, J.~Justin, and G.~Pirillo.
\newblock Episturmian words and some constructions of de {Luca} and {Rauzy}.
\newblock {\em Theoret. Comput. Sci.}, 255:539--553, 2001.

\bibitem{FiZa13}
G. Fici and L.Q. Zamboni.
\newblock On the least number of palindromes contained in an infinite word.
\newblock {\em Theoret. Comput. Sci.}, 481:1--8, 2013.

\bibitem{GJWZ09}
A.~Glen, J.~Justin, S.~Widmer, and L.Q. Zamboni.
\newblock Palindromic richness.
\newblock {\em European J. Combinatorics}, 30(2):510--531, 2009.

\bibitem{KRS13}
D.~Kosolobov, M.~Rubinchik, and A.~M. Shur.
\newblock Finding distinct subpalindromes online.
\newblock In {\em Proc. Prague Stringology Conference. PSC 2013}, pages 63--69.
  Czech Technical University in Prague, 2013.

\end{thebibliography}


\end{document}
